Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Increase the timeout when batch deleting permissions #12276

Merged
merged 2 commits into from
Sep 12, 2024

Conversation

dcorbacho
Copy link
Contributor

Deleting a vhost with many user/topic permissions (10k-100k users with one permission each) is very slow in the current implementation of Khepri. khepri_tree:eval_keep_while_conditions is particularly slow, mainly on OS X but it is also a bit slower in Linux (tested in Ubuntu) than the equivalent Mnesia operation to delete the permissions.

This PR allows the rabbitmqctl delete_vhost <vhost> command to finish without a timeout, even if still equally slow.

Copy link
Contributor

@kjnilsson kjnilsson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ofc, this kind of perf isn't ideal as a slow operation like this would block the meta data store for the duration of it's execution which means other operations that started after may also time out. Increasing the timeout is a necessary band aid for now.

Copy link
Member

@the-mikedavis the-mikedavis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the first thing deleted in rabbit_vhost:delete/2 so calling rabbitmqctl delete_vhost myvhost against a node in a minority will hang with this change. I think that's ok but for cluster_minority_SUITE we will need to change the delete_vhost case to pass a timeout. Something like:

delete_vhost(Config) ->
    ?assertError(
      {erpc, timeout},
      rabbit_ct_broker_helpers:rpc(
        Config, 0,
        rabbit_vhost, delete, [<<"vhost1">>, <<"acting-user">>], 1_000)).

@michaelklishin michaelklishin merged commit caad8a5 into main Sep 12, 2024
199 checks passed
@michaelklishin michaelklishin deleted the slow-vhost-deletion-many-permissions branch September 12, 2024 13:09
michaelklishin added a commit that referenced this pull request Sep 12, 2024
Increase the timeout when batch deleting permissions (backport #12276)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants